Discovering closed and maximal embedded patterns from large tree data

Authors

Abstract

We address the problem of summarizing embedded tree patterns extracted from large data trees. We do so by defining and mining closed and maximal embedded unordered tree patterns from a single large data tree. We design an embedded frequent pattern mining algorithm extended with a local closedness checking technique. This algorithm is called closedEmbTM-prune as it eagerly eliminates non-closed patterns. To mitigate the generation of intermediate patterns, we devise pattern search space pruning rules to proactively detect and prune branches in the pattern search space which do not correspond to closed patterns. The pruning rules are accommodated into the extended embedded pattern miner to produce a new algorithm, called closedEmbTM-prune, for mining all the closed and maximal embedded frequent patterns from large data trees. Our extensive experiments on synthetic and real large-tree datasets demonstrate that, on dense datasets, closedEmbTM-prune not only generates a complete set of closed and maximal patterns that is substantially smaller than that generated by the embedded pattern miner, but also runs much faster with negligible overhead on pattern pruning.
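
To make the two summary notions concrete, here is a minimal sketch under the standard definitions: a frequent pattern is closed if no proper super-pattern has the same support, and maximal if no proper super-pattern is frequent at all. As a loud simplification, item sets stand in for tree patterns and the subset test stands in for the embedded-subtree relation; the names `closed_patterns`, `maximal_patterns`, and `FREQUENT` are hypothetical, and nothing here reproduces the paper's algorithm or its support definition.

```python
from typing import Dict, FrozenSet, List

# Assumption: a pattern is modelled as a frozenset of labels, and
# "p < q" (proper subset) stands in for "p is a proper sub-pattern of q".
Pattern = FrozenSet[str]


def closed_patterns(support: Dict[Pattern, int]) -> List[Pattern]:
    """Keep patterns that have no proper super-pattern with equal support."""
    return [
        p for p in support
        if not any(p < q and support[q] == support[p] for q in support)
    ]


def maximal_patterns(support: Dict[Pattern, int]) -> List[Pattern]:
    """Keep patterns that have no frequent proper super-pattern."""
    return [p for p in support if not any(p < q for q in support)]


# Toy input: every frequent pattern with its support (made-up numbers).
FREQUENT: Dict[Pattern, int] = {
    frozenset("a"): 4,
    frozenset("b"): 3,
    frozenset("c"): 2,
    frozenset("ab"): 3,
    frozenset("ac"): 2,
    frozenset("bc"): 2,
    frozenset("abc"): 2,
}

closed = closed_patterns(FREQUENT)
maximal = maximal_patterns(FREQUENT)
```

On this toy input the seven frequent patterns reduce to three closed patterns and one maximal pattern, and every maximal pattern is also closed, which is why both sets serve as compact summaries of the full frequent-pattern set.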

Similar resources

Discovering Frequent Tree Patterns over Data Streams

Since tree-structured data such as XML files are widely used for data representation and exchange on the Internet, discovering frequent tree patterns over tree-structured data streams becomes an interesting issue. In this paper, we propose an online algorithm to continuously discover the current set of frequent tree patterns from the data stream. A novel and efficient technique is introduced to...

Efficient Mining of Closed Flock Patterns from Large Trajectory Data

In this paper, we study the closed pattern mining problem for a class of spatio-temporal patterns, called closed (k, r)-flock patterns in trajectory databases. A (k, r)-flock pattern (Gudmundsson and van Kreveld, 2006) represents a set of moving objects traveling close to each other within radius r during a time period of length k. Based on the notion of the envelope for a flock pattern, we introduc...

A SAT-Based Approach for Discovering Frequent, Closed and Maximal Patterns in a Sequence

In this paper we propose a satisfiability-based approach for enumerating all frequent, closed and maximal patterns with wildcards in a given sequence. In this context, since frequency is the most used criterion, we introduce a new polynomial inductive formulation of the cardinality constraint as a Boolean formula. A nogood-based formulation of the anti-monotonicity property is proposed and dynam...

Efficient Algorithms for Discovering Frequent and Maximal Substructures from Large Semistructured Data

In this paper, we review recent advances in efficient algorithms for semi-structured data mining, that is, discovery of rules and patterns from structured data such as sets, sequences, trees, and graphs. After introducing basic definitions and problems, we present efficient algorithms for frequent and maximal pattern mining for classes of sets, sequences, and trees. In particular, we explain ge...

Discovering Frequent Embedded Subtree Patterns from Large Databases of Unordered Labeled Trees

Recent years have witnessed a surge of research interest in knowledge discovery from data domains with complex structures, such as trees and graphs. In this paper, we address the problem of mining maximal frequent embedded subtrees which is motivated by such important applications as mining “hot” spots of Web sites from Web usage logs and discovering significant “deep” structures from tree-like...

Journal

Journal title: Data and Knowledge Engineering

Year: 2021

ISSN: 1872-6933, 0169-023X

DOI: https://doi.org/10.1016/j.datak.2021.101890